Consolidate all dependencies into a single indexed file (to reduce I/O)#704

Merged: jviotti merged 1 commit into main from single-deps on Mar 6, 2026

Conversation

@jviotti (Member) commented Mar 4, 2026

Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>

@github-actions bot (Contributor) left a comment

Benchmark Index (enterprise)

| Benchmark suite | Current: 57320a2 | Previous: 6d786dc | Ratio |
|---|---|---|---|
| Add one schema (0 existing) | 37 ms | 48 ms | 0.77 |
| Add one schema (100 existing) | 237 ms | 300 ms | 0.79 |
| Add one schema (1000 existing) | 2453 ms | 2827 ms | 0.87 |

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions bot (Contributor) left a comment

Benchmark Index (community)

| Benchmark suite | Current: 57320a2 | Previous: 6d786dc | Ratio |
|---|---|---|---|
| Add one schema (0 existing) | 35 ms | 42 ms | 0.83 |
| Add one schema (100 existing) | 241 ms | 280 ms | 0.86 |
| Add one schema (1000 existing) | 2493 ms | 2852 ms | 0.87 |

This comment was automatically generated by workflow using github-action-benchmark.

@jviotti force-pushed the single-deps branch 5 times, most recently from 5b26980 to 95664a3, on March 6, 2026 19:04
Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
@jviotti marked this pull request as ready for review on March 6, 2026 19:10
@jviotti changed the title from "[WIP] Experiment with a single deps.json file" to "Consolidate all dependencies into a single indexed file (to save I/O)" on Mar 6, 2026
@jviotti changed the title from "Consolidate all dependencies into a single indexed file (to save I/O)" to "Consolidate all dependencies into a single indexed file (to reduce I/O)" on Mar 6, 2026
@cubic-dev-ai bot left a comment

No issues found across 31 files

@jviotti merged commit 3178dc6 into main on Mar 6, 2026
6 checks passed
@jviotti deleted the single-deps branch on March 6, 2026 19:22
@augmentcode bot commented Mar 6, 2026

🤖 Augment PR Summary

Summary: This PR experiments with consolidating build dependency tracking into a single output-level file (deps.txt) to simplify incremental runs and reduce per-target sidecar artifacts.

Changes:

  • Replaces per-target .deps files with an in-memory dependency map persisted as deps.txt
  • Extends BuildAdapterFilesystem to load/store both dependency entries and cached file marks across runs; adds flush_dependencies()
  • Updates the indexer to refresh adapter marks when version.json/configuration.json/comment.json content changes
  • Flushes deps.txt during the cleanup phase and tracks it as an output artifact
  • Changes Output::write_json_if_different() to return whether a write occurred (to decide when to refresh marks)
  • Updates CLI and unit tests to expect deps.txt (and to remove expectations for *.deps)
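
The write_json_if_different() change listed above can be pictured with a small sketch. The helper below is hypothetical (its name, signature, and plain-text comparison are assumptions, not the project's Output API); it only illustrates how returning "did a write occur" lets a caller decide whether a cached mark needs refreshing.

```cpp
#include <filesystem>
#include <fstream>
#include <ios>
#include <iterator>
#include <string>
#include <string_view>

// Hypothetical helper, NOT the project's Output::write_json_if_different().
// Writes `contents` to `path` only when the file is missing or differs,
// and reports whether a write actually happened.
inline auto write_if_different(const std::filesystem::path &path,
                               const std::string_view contents) -> bool {
  if (std::filesystem::exists(path)) {
    std::ifstream input{path, std::ios::binary};
    const std::string current(std::istreambuf_iterator<char>{input},
                              std::istreambuf_iterator<char>{});
    if (std::string_view{current} == contents) {
      return false; // Unchanged: a previously cached mark stays valid
    }
  }

  std::ofstream output{path, std::ios::binary | std::ios::trunc};
  output.write(contents.data(),
               static_cast<std::streamsize>(contents.size()));
  return true; // Changed: the caller should refresh the cached mark
}
```

A caller would refresh the adapter's mark for the file only when this returns true.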

Technical Notes: deps.txt encodes targets (t), static/dynamic deps (s/d), and cached marks (m) to drive cache hits on subsequent runs.
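
To make the tagged encoding concrete, here is a minimal sketch of a line-oriented reader. It assumes one record per line, laid out as a single-character tag, one space, then the value; that layout is inferred from the review snippets in this thread and is not confirmed by the PR. The names Record and parse_records are illustrative only.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Assumed layout (not confirmed by the PR): "<tag> <value>\n" per record,
// where the tag is one of t, s, d, or m.
using Record = std::pair<char, std::string_view>;

// The returned views point into `contents`, which must outlive them
inline auto parse_records(const std::string_view contents)
    -> std::vector<Record> {
  std::vector<Record> records;
  std::size_t position{0};
  while (position < contents.size()) {
    auto end{contents.find('\n', position)};
    if (end == std::string_view::npos) {
      end = contents.size();
    }

    // A record needs at least "<tag><space>"; skip anything shorter
    if (end - position >= 2 && contents[position + 1] == ' ') {
      records.emplace_back(contents[position],
                           contents.substr(position + 2, end - position - 2));
    }

    position = end + 1;
  }

  return records;
}
```

A real loader would then dispatch on the tag, for example starting a new target on t and attaching the following s and d records to it.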

@augmentcode bot left a comment

Review completed. 5 suggestions posted.


dependency = (this->root / dependency).lexically_normal();

const char tag{contents[position]};
const std::string_view value{contents.data() + position + 2,

Parsing deps.txt lines via value doesn’t strip a trailing \r, so CRLF files can end up with keys/paths that include \r and won’t match later lookups. This is especially likely on Windows where text-mode writes typically produce CRLF.

Severity: medium
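
One possible mitigation, sketched under the assumption that each line is handled as a std::string_view before the tag and value are split: drop a single trailing carriage return first. The helper name strip_trailing_cr is hypothetical.

```cpp
#include <string_view>

// Hypothetical helper: removes one trailing '\r' so that CRLF-terminated
// lines yield the same keys and paths as LF-terminated ones.
inline auto strip_trailing_cr(std::string_view line) -> std::string_view {
  if (!line.empty() && line.back() == '\r') {
    line.remove_suffix(1);
  }

  return line;
}
```

Opening the file in binary mode for both reads and writes would be an alternative way to keep line endings consistent across platforms.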


std::stoll(std::string{ns_part})};
const auto mark_value{mark_type{
std::chrono::duration_cast<mark_type::duration>(nanoseconds)}};
this->marks.insert_or_assign(

Marks loaded from deps.txt are cached even if the corresponding output file no longer exists; later mark() can return this cached value and build() may skip regenerating a missing destination. That can cascade into failures like Output::track() asserting the file exists even though it was deleted between runs.

Severity: high


std::ofstream stream{deps_path};
assert(!stream.fail());

for (const auto &entry : this->dependencies_map) {

flush_dependencies() iterates dependencies_map/marks without taking the associated locks; if any thread is still calling write_dependencies()/refresh(), this becomes a data race/UB. If the method is intended to be thread-safe, it likely needs shared locking (or an explicit contract that all work is joined first).

Severity: medium


dependencies_stream.flush();
dependencies_stream.close();
sourcemeta::core::flush(dependencies_path);
stream.flush();

deps.txt is only flushed with stream.flush() and closed; it is no longer passed through sourcemeta::core::flush, as the previous per-target .deps writes were. If incremental rebuild correctness relies on this file surviving crashes or power loss, it may need the same durability treatment.

Severity: medium


// Output files should always have their marks cached
// Only input files or new output files are not
assert(!this->has_previous_run ||
!path.string().starts_with(this->root.string()) ||

The path.string().starts_with(this->root.string()) check can misclassify paths that merely share a prefix (e.g., /out vs /out2) and can be brittle with normalization/case differences. That could make this debug assertion fire for non-output files or miss real output-root paths.

Severity: low
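
A component-wise comparison avoids the shared-prefix problem. The sketch below is one way to do it with std::filesystem alone; is_within is a hypothetical helper, and it works purely lexically, so it does not resolve symlinks or handle case-insensitive filesystems.

```cpp
#include <algorithm>
#include <filesystem>
#include <iterator>

// Hypothetical helper: true when `path` is `root` or lies beneath it,
// comparing whole path components rather than raw string prefixes.
inline auto is_within(const std::filesystem::path &root,
                      const std::filesystem::path &path) -> bool {
  const auto normal_root{root.lexically_normal()};
  const auto normal_path{path.lexically_normal()};

  auto root_end{normal_root.end()};
  // A trailing separator yields an empty final component; ignore it
  if (!normal_root.empty() && std::prev(root_end)->empty()) {
    --root_end;
  }

  // Every component of the root must match the start of the path
  return std::mismatch(normal_root.begin(), root_end, normal_path.begin(),
                       normal_path.end())
             .first == root_end;
}
```

With this check, /out2/schemas/a.json is no longer treated as being inside /out, while /out/schemas/a.json still is.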

